Image Processing Method and Apparatus

ABSTRACT

An image processing technique includes acquiring a main image of a scene and determining one or more facial regions in the main image. The facial regions are analysed to determine if any of the facial regions includes a defect. A sequence of relatively low resolution images nominally of the same scene is also acquired. One or more sets of low resolution facial regions in the sequence of low resolution images are determined and analysed for defects. Defect free facial regions of a set are combined to provide a high quality defect free facial region. At least a portion of any defective facial regions of the main image are corrected with image information from a corresponding high quality defect free facial region.

RELATED APPLICATIONS

This application is a Continuation of U.S. patent application Ser. No. 13/947,095 filed on Jul. 21, 2013, now U.S. Pat. No. 9,025,837; which is a Continuation of U.S. patent application Ser. No. 13/103,077 filed on May 8, 2011, now U.S. Pat. No. 8,515,138; which is a Continuation of U.S. patent application Ser. No. 13/034,707 filed on Feb. 25, 2011, now U.S. Pat. No. 8,494,232; which is a Continuation of U.S. patent application Ser. No. 11/752,925 filed on May 24, 2007, now U.S. Pat. No. 7,916,971; all of which are hereby incorporated by reference in their entirety.

BACKGROUND

The present invention relates to an image processing method and apparatus. One of the most common reasons for an acquired digital photograph to be discarded or spoiled is because one or more of the facial regions in the photograph suffer from photographic defects other than red-eye defects, even though red-eye defects can be common in cameras not operating with the advantages of the techniques described, e.g., at U.S. Pat. No. 6,407,777, and at US published applications nos. 2005/0140801, 2005/0041121, 2006/0093212, and 2006/0204054, which are assigned to the same assignee and hereby incorporated by reference. Common examples occur when people move or shake their head; when someone closes their eyes or blinks or someone yawns. Where there are several faces in a photograph, it is sufficient for one face to be “defective” for the whole shot to be spoiled. Although digital cameras allow users to quickly shoot several pictures of the same scene, typically, such cameras do not provide warnings of facial errors, nor provide a way to correct for such errors without repeating the composition stages (i.e. getting everyone together again in a group) of taking the photograph and re-shooting the scene. This type of problem is particularly difficult with children who are often photographed in unusual spontaneous poses which cannot be duplicated. When such a shot is spoiled because the child moved their head at the moment of acquisition, it is very disappointing for the photographer.

U.S. Pat. No. 6,301,440, which is incorporated by reference, discloses an image acquisition device wherein the instant of exposure is controlled by image content. When a trigger is activated, the image proposed by the user is analysed and imaging parameters are altered to obtain optimum image quality before the device proceeds to take the image. For example, the device could postpone acquisition of the image until every person in the image is smiling.

SUMMARY OF THE INVENTION

An image processing method is provided including acquiring a main image of a scene. One or more facial regions are determined in the main image. The one or more main image facial regions are analyzed for defects and one or more are determined to be defective. A sequence of relatively low resolution images nominally of the scene is acquired. One or more sets of low resolution facial regions in the sequence are analyzed to determine one or more that correspond to a defective main image facial region. At least a portion of the defective main image facial region is corrected with image information from one or more corresponding low resolution facial regions not including a same defect as said portion of said defective main image facial region.

The sequence of low resolution images may be specifically acquired for a time period not including a time for acquiring the main image. The method may also include combining defect-free low resolution facial regions into a combined image, and correcting at least the portion of the defective main image facial region with image information from the combined image.

Another image processing method is provided that includes acquiring a main image of a scene. One or more facial regions in the main image are determined, and analyzed to determine if any are defective. A sequence of relatively low resolution images is acquired nominally of the scene for a time period not including a time for acquiring the main image. One or more sets of low resolution facial regions are determined in the sequence of low resolution images. The sets of facial regions are analyzed to determine if any facial regions of a set corresponding to a defective facial region of the main image include a defect. Defect free facial regions of the corresponding set are combined to provide a high quality defect free facial region. At least a portion of any defective facial regions of said main image are corrected with image information from a corresponding high quality defect free facial region.

The time period may include one or more of a time period preceding or a time period following the time for acquiring the main image. The correcting may include applying a model including multiple vertices defining a periphery of a facial region to each high quality defect-free facial region and a corresponding defective facial region. Pixels of the high quality defect-free facial region may be mapped to the defective facial region according to the correspondence of vertices for the respective regions. The model may include an Active Appearance Model (AAM).

The main image may be acquired at an exposure level different to the exposure level of the low resolution images. The correcting may include mapping luminance levels of the high quality defect free facial region to luminance levels of the defective facial region.

Sets of low resolution facial regions from the sequence of low resolution images may be stored in an image header file of the main image.

The method may include displaying the main image and/or corrected image, and selected actions may be user-initiated.

The analyzing of the sets may include, prior to the combining in the second method, removing facial regions including faces exceeding an average size of faces in a set of facial regions by a threshold amount from said set of facial regions, and/or removing facial regions including faces with an orientation outside an average orientation of faces in a set of facial regions by a threshold amount from said set of facial regions.

The analyzing of sets may include the following: applying an Active Appearance Model (AAM) to each face of a set of facial regions; analyzing AAM parameters for each face of the set of facial regions to provide an indication of facial expression; and prior to the combining in the second method, removing faces having a defective expression from the set of facial regions.

The analyzing of sets may include the following: applying an Active Appearance Model (AAM) to each face of a set of facial regions; analysing AAM parameters for each face of the set of facial regions to provide an indication of facial orientation; and prior to said combining in the second method, removing faces having an undesirable orientation from said set of facial regions.

The analyzing of facial regions may include applying an Active Appearance Model (AAM) to each facial region, and analyzing AAM parameters for each facial region to provide an indication of facial expression, and/or analyzing each facial region for contrast, sharpness, texture, luminance levels or skin color or combinations thereof, and/or analyzing each facial region to determine if an eye of the facial region is closed, if a mouth of the facial region is open and/or if a mouth of the facial region is smiling.

The method may be such that the correcting, and the combining in the second method, only occur when the set of facial regions exceeds a given number. The method may also include resizing and aligning faces of the set of facial regions, and the aligning may be performed according to cardinal points of faces of the set of facial regions.

The correcting may include blending and/or infilling a corrected region of the main image with the remainder of the main image.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments will now be described, by way of example, with reference to the accompanying drawings, in which:

FIG. 1 is a block diagram of an image processing apparatus operating in accordance with an embodiment of the present invention;

FIG. 2 is a flow diagram of an image processing method according to a preferred embodiment of the present invention; and

FIGS. 3 and 4 show exemplary sets of images to which an active appearance model has been applied.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Certain embodiments can be implemented with a digital camera which incorporates (i) a face tracker operative on a preview image stream; (ii) a super-resolution processing module configured to create a higher resolution image from a composite of several low-resolution images; and (iii) a facial region quality analysis module for determining the quality of facial regions.

Preferably, super-resolution is applied to preview facial regions extracted during face tracking.

The embodiments enable the correction of errors or flaws in the facial regions of an acquired image within a digital camera using preview image data and employing super-resolution techniques.

FIG. 1 is a block diagram of an image acquisition device 20, which in the present embodiment is a portable digital camera, operating in accordance with certain embodiments. It will be appreciated that many of the processes implemented in the digital camera are implemented in or controlled by software operating on a microprocessor, central processing unit, controller, digital signal processor and/or an application specific integrated circuit, collectively depicted as processor 120. All user interface and control of peripheral components such as buttons and display is controlled by a microcontroller 122.

In operation, the processor 120, in response to a user input at 122, such as half pressing a shutter button (pre-capture mode 32), initiates and controls the digital photographic process. Ambient light exposure is determined using a light sensor 40 in order to automatically determine if a flash is to be used. The distance to the subject is determined using a focusing mechanism 50 which also focuses the image on an image capture device 60. If a flash is to be used, processor 120 causes a flash device 70 to generate a photographic flash in substantial coincidence with the recording of the image by the image capture device 60 upon full depression of the shutter button.

The image capture device 60 digitally records the image in colour. The image capture device is known to those familiar with the art and may include a CCD (charge coupled device) or CMOS to facilitate digital recording. The flash may be selectively generated either in response to the light sensor 40 or a manual input 72 from the user of the camera. The high resolution image recorded by image capture device 60 is stored in an image store 80 which may comprise computer memory such as a dynamic random access memory or a non-volatile memory. The camera is equipped with a display 100, such as an LCD, both for displaying preview images and displaying a user interface for camera control software.

In the case of preview images which are generated in the pre-capture mode 32 with the shutter button half-pressed, the display 100 can assist the user in composing the image, as well as being used to determine focus and exposure. Temporary storage 82 is used to store one or a plurality of the stream of preview images and can be part of the image store 80 or a separate component. The preview image is usually generated by the image capture device 60. For speed and memory efficiency reasons, preview images usually have a lower pixel resolution than the main image taken when the shutter button is fully depressed, and are generated by sub-sampling a raw captured image using software 124 which can be part of the general processor 120 or dedicated hardware or a combination thereof.

In the present embodiment, a face detection and tracking module 130 such as described in U.S. application Ser. No. 11/464,083, filed Aug. 11, 2006, which is hereby incorporated by reference, is operably connected to the sub-sampler 124 to control the sub-sampled resolution of the preview images in accordance with the requirements of the face detection and tracking module. Preview images stored in temporary storage 82 are available to the module 130 which records the locations of faces tracked and detected in the preview image stream. In one embodiment, the module 130 is operably connected to the display 100 so that boundaries of detected and tracked face regions can be superimposed on the display around the faces during preview.

In the embodiment of FIG. 1, the face tracking module 130 is arranged to extract and store tracked facial regions at relatively low resolution in a memory buffer such as memory 82 and possibly for storage as meta-data in an acquired image header stored in memory 80. Where multiple face regions are tracked, a buffer is established for each tracked face region. These buffers are of finite size (10-20 extracted face regions in a preferred embodiment) and generally operate on a first-in-first-out (FIFO) basis.
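
By way of illustration only, the per-face buffering described above might be sketched as follows. The class name `FaceRegionBuffers`, the track identifiers and the use of strings in place of image crops are assumptions made for this sketch and do not appear in the described embodiment.

```python
from collections import deque

BUFFER_SIZE = 20  # assumed capacity; the text gives 10-20 regions per buffer


class FaceRegionBuffers:
    """One bounded first-in-first-out buffer per tracked face (hypothetical helper)."""

    def __init__(self, capacity=BUFFER_SIZE):
        self.capacity = capacity
        self.buffers = {}  # track_id -> deque of low-resolution face crops

    def add(self, track_id, face_crop):
        # A deque with maxlen discards its oldest entry when a new one
        # is appended to a full buffer, which gives FIFO behaviour.
        buf = self.buffers.setdefault(track_id, deque(maxlen=self.capacity))
        buf.append(face_crop)

    def regions(self, track_id):
        # Oldest first; an unknown track yields an empty list.
        return list(self.buffers.get(track_id, ()))


buffers = FaceRegionBuffers(capacity=3)
for frame_index in range(5):
    buffers.add("face_0", f"crop_{frame_index}")
print(buffers.regions("face_0"))  # the two oldest crops have been dropped
```

A bounded buffer of this kind keeps memory use fixed regardless of how long the preview stream runs before the main image is acquired.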

According to the preferred embodiment, the device 20 further comprises an image correction module 90. Where the module 90 is arranged for off-line correction of acquired images in an external processing device 10, such as a desktop computer, a colour printer or a photo kiosk, face regions detected and/or tracked in preview images are preferably stored as meta-data within the image header. However, where the module 90 is implemented within the camera 20, it can have direct access to the buffer 82 where preview images and/or face region information is stored.

In this embodiment, the module 90 receives the captured high resolution digital image from the store 80 and analyzes it to detect defects. The analysis is performed as described in the embodiments to follow. If defects are found, the module can modify the image to remove the defect. The modified image may be either displayed on image display 100, saved on a persistent storage 112 which can be internal or a removable storage such as CF card, SD card or the like, or downloaded to another device via image output means 110 which can be tethered or wireless. The module 90 can be brought into operation either automatically each time an image is captured, or upon user demand via input 30. Although illustrated as a separate item, where the module 90 is part of the camera, it may be implemented by suitable software on the processor 120.

The main components of the image correction module include a quality module 140 which is arranged to analyse face regions from either low or high resolution images to determine if these include face defects. A super-resolution module 160 is arranged to combine multiple low-resolution face regions of the same subject generally with the same pose and a desirable facial expression to provide a high quality face region for use in the correction process. In the present embodiment, an active appearance model (AAM) module 150 produces AAM parameters for face regions again from either low or high resolution images.

AAM modules are well known and a suitable module for the present embodiment is disclosed in “Fast and Reliable Active Appearance Model Search for 3-D Face Tracking”, F Dornaika and J Ahlberg, IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, Vol. 34, No. 4, pp. 1838-1853, August 2004, although other models based on the original paper by T F Cootes et al “Active Appearance Models” Proc. European Conf. Computer Vision, 1998, pp 484-498 could also be employed.

The AAM module 150 can preferably cooperate with the quality module 140 to provide pose and/or expression indicators to allow for selection of images in the analysis and optionally in the correction process described below. Also, the AAM module 150 can preferably cooperate with the super-resolution module 160 to provide pose indicators to allow for selection of images in the correction process, again described in more detail below.

Referring now to FIG. 2, which illustrates an exemplary processing flow for certain embodiments, when a main image is acquired, step 230, the location and size of any detected/tracked face region(s) in the main acquired image (high resolution) will be known by the module 90 from the module 130. Face detection can either be applied directly on the acquired image and/or information for face regions previously detected and/or tracked in the preview stream can be used for face detection in the main image (indicated by the dashed line extending from step 220). At step 250, the facial region quality analysis module 140 extracts and analyzes face regions tracked/detected at step 240 in the main image to determine the quality of the acquired face regions. For example, the module 140 can apply a preliminary analysis to measure the overall contrast, sharpness and/or texture of detected face region(s). This can indicate if the entire face region was blurred due to motion of the subject at the instant of acquisition. If a facial region is not sufficiently well defined then it is marked as a blur defect. In addition or alternatively, another stage of analysis can focus on the eye region of the face(s) to determine if one or both eyes were fully or partially closed at the instant of acquisition and the face region is categorized accordingly. As mentioned previously, if AAM analysis is performed on the image, then the AAM parameters can be used to indicate whether a subject's eyes are open or not. It should be noted that in the above analyses, the module 90 detects blink or blur due to localized movement of the subject as opposed to global image blur.
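
By way of illustration only, the preliminary sharpness test might be sketched as follows. The variance-of-Laplacian measure, the function names and the threshold value are assumptions made for this sketch; the described embodiments do not specify a particular sharpness measure.

```python
def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the interior of a 2-D grid."""
    rows, cols = len(gray), len(gray[0])
    responses = []
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_blur_defect(gray, threshold=100.0):
    # The threshold is an illustrative assumption; in practice it would be
    # tuned for the sensor and for the size of the face region.
    return laplacian_variance(gray) < threshold


# A checkerboard has strong edges; a uniform patch has none.
sharp = [[255 if (x + y) % 2 else 0 for x in range(8)] for y in range(8)]
flat = [[128] * 8 for _ in range(8)]
print(is_blur_defect(sharp), is_blur_defect(flat))
```

Because the measure is computed over the face region only, it responds to localized subject movement rather than to blur across the whole image.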

Another or alternative stage of analysis focuses on the mouth region and determines if the mouth is opened in a yawn or indeed not smiling; again the face region is categorized accordingly. As mentioned previously, if AAM analysis is performed on the image, then the AAM parameters can be used to indicate the state of a subject's mouth.

Other exemplary tests might include luminance levels, skin colour and texture histograms, abrupt facial expressions (smiling, frowning) which may cause significant variations in facial features (mouth shape, furrows in brow). Specialized tests can be implemented as additional or alternative image analysis filters, for example, a Hough transform filter could be used to detect parallel lines in a face region above the eyes indicating a “furrowed brow”. Other image analysis techniques such as those known in the art and as disclosed in U.S. Pat. No. 6,301,440 can also be employed to categorise the face region(s) of the main image.

After this analysis, it is decided (for each face region) if any of these defects occurred, step 260, and the camera or external processing device user can be offered the option of repairing the defect based on the buffered (low resolution) face region data, step 265.

When the repair option is actuated by the user, each of the low-resolution face regions is first analyzed by the face region quality analyzer, step 270. As this analysis is operative on lower resolution images acquired and stored at steps 200/210, the analysis may vary from the analysis of face regions in the main acquired image at step 250. Nevertheless the analysis steps are similar in that each low-resolution face region is analyzed to determine if it suffers from image defects in which case it should not be selected at step 280 to reconstruct the defective face region(s) in the main image. After this analysis and selection, if there are not enough “good” face regions corresponding to a defective face region available from the stream of low-resolution images, an indication is passed to the user that image repair is not viable. Where there are enough “good” face regions, these are passed on for resizing and alignment, step 285.
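
By way of illustration only, the selection of steps 270/280 and the viability check might be sketched as follows. The minimum count, the function name and the dictionary layout of the analysis results are assumptions made for this sketch.

```python
MIN_GOOD_REGIONS = 3  # assumed minimum; the description does not give a number


def select_good_regions(regions, has_defect, minimum=MIN_GOOD_REGIONS):
    """Return the defect-free regions, or None when too few remain for repair."""
    good = [region for region in regions if not has_defect(region)]
    return good if len(good) >= minimum else None


# Hypothetical per-region analysis results from the quality analyzer.
buffered = [
    {"frame": 0, "eyes_open": True},
    {"frame": 1, "eyes_open": True},
    {"frame": 2, "eyes_open": False},
    {"frame": 3, "eyes_open": True},
]
selected = select_good_regions(buffered, lambda region: not region["eyes_open"])
if selected is None:
    print("image repair is not viable")
else:
    print([region["frame"] for region in selected])
```

Passing the defect test in as a function lets the same selection be reused for blink, blur or expression tests.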

This step re-sizes each face region and performs some local alignment of cardinal face points to correct for variations in pose and to ensure that the low-resolution face regions overlap one another as uniformly as is practical for later processing.

It should also be noted that as these image regions were captured in sequence and over a relatively short duration, it is expected that they are of approximately the same size and orientation. Thus, image alignment can be achieved using cardinal face points, in particular those relating to the eyes, mouth, and lower face (chin region) which is normally delineated by a distinct boundary edge, and the upper face which is normally delineated by a distinctive hairline boundary. Some slight scaling and morphing of extracted face regions may be used to achieve reasonable alignment; however, a very precise alignment of these images is not desirable as it would undermine the super-resolution techniques which enable a higher resolution image to be determined from several low-resolution images.

It should be noted that the low-resolution images captured and stored at steps 200/210 can be captured either from a time period before capturing the main image or from a period following capture of the main image (indicated by the dashed line extending from step 230). For example, it may be possible to capture suitable defect free low resolution images in a period immediately after a subject has stopped moving/blinking etc. following capture of the main image.

This set of selected defect free face regions is next passed to a super-resolution module 160 which combines them using known super-resolution methods to yield a high resolution face region which is compatible with a corresponding region of the main acquired image.
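
By way of illustration only, one simple member of the family of known super-resolution methods, often called shift-and-add, might be sketched as follows. The sketch assumes that the offset of each low-resolution region on the finer grid is already known and ignores blur and noise; these assumptions, and the function name, are made for illustration and are not taken from the described embodiments.

```python
def shift_and_add(frames, factor=2):
    """Interleave low-resolution frames onto a grid `factor` times finer.

    `frames` maps an assumed (dy, dx) offset, in fine-grid pixels, to a 2-D list.
    Fine-grid cells that no frame covers are left as None.
    """
    first = next(iter(frames.values()))
    rows, cols = len(first) * factor, len(first[0]) * factor
    total = [[0.0] * cols for _ in range(rows)]
    count = [[0] * cols for _ in range(rows)]
    for (dy, dx), frame in frames.items():
        for y, row in enumerate(frame):
            for x, value in enumerate(row):
                total[y * factor + dy][x * factor + dx] += value
                count[y * factor + dy][x * factor + dx] += 1
    return [
        [total[y][x] / count[y][x] if count[y][x] else None for x in range(cols)]
        for y in range(rows)
    ]


# Simulate four low-resolution views of a 4x4 "true" region, each sampled
# at a different offset, then recombine them.
truth = [[float(4 * y + x) for x in range(4)] for y in range(4)]
frames = {
    (dy, dx): [row[dx::2] for row in truth[dy::2]]
    for dy in (0, 1)
    for dx in (0, 1)
}
restored = shift_and_add(frames)
print(restored == truth)
```

The additional detail comes from the small offsets between the low-resolution regions, which is why regions that sample the face at slightly different positions are useful to the combination.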

Now the system has available to it a high quality defect-free combination face region and a high resolution main image with a generally corresponding defective face region.

If this has not already been performed for quality analysis, the defective face region(s) as well as the corresponding high quality defect-free face region are subjected to AAM analysis, step 300. Referring now to FIG. 3(a) to (d), these illustrate some images including face regions which have been processed by the AAM module 150. In this case, the model represented by the wire frame superimposed on the face is tuned for a generally forward facing and generally upright face, although separate models can be deployed for use with inclined faces or faces in profile. Once the model has been applied, it returns a set of coordinates for the vertices of the wire frame, as well as texture parameters for each of the triangular elements defined by adjacent vertices. The relative coordinates of the vertices as well as the texture parameters can in turn provide indicators linked to the expression and inclination of the face which can be used in quality analysis as mentioned above.

It will therefore be seen that the AAM module 150 can also be used in the facial region analysis steps 250/270 to provide an indicator of whether a mouth or eyes are open i.e. smiling and not blinking; and also to help determine in steps 285/290 implemented by the super-resolution module 160 whether facial regions are similarly aligned or inclined for selection before super-resolution.

So, using FIG. 3(a) as an example of a facial region produced by super-resolution of low resolution images, it is observed that the set of vertices comprising the periphery of the AAM model define a region which can be mapped onto a corresponding set of peripheral vertices of FIG. 3(b) to FIG. 3(d) where these images have been classified and confirmed by the user as defective facial regions and candidates for correction.
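
By way of illustration only, a mapping driven by vertex correspondence might be sketched for a single triangular element as follows. Barycentric interpolation within each triangle (a piecewise-affine warp) is an assumption made for this sketch, as are the function names and coordinates; the described embodiments do not prescribe a particular mapping.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    w_a = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w_b = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return w_a, w_b, 1.0 - w_a - w_b


def map_point(p, src_tri, dst_tri):
    """Carry p from the source triangle to the same relative position in dst_tri."""
    w_a, w_b, w_c = barycentric(p, *src_tri)
    (ax, ay), (bx, by), (cx, cy) = dst_tri
    return (w_a * ax + w_b * bx + w_c * cx, w_a * ay + w_b * by + w_c * cy)


# Hypothetical vertex triples: one element of the model fitted to the
# defect-free region and the corresponding element in the defective region.
source_triangle = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
target_triangle = ((10.0, 10.0), (18.0, 10.0), (10.0, 18.0))
print(map_point((1.0, 1.0), source_triangle, target_triangle))
```

Applying the same mapping to every pixel of every triangular element carries the whole region from one set of vertices to the other.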

In relation to FIG. 4, the model parameters for FIG. 4(a) or 4(b) which might represent super-resolved defect free face regions could indicate that the left-right orientation of these face regions would not make them suitable candidates for correcting the face region of FIG. 4(c). Similarly, the face region of FIG. 4(f) could be a more suitable candidate than the face region of FIG. 4(e) for correcting the face region of FIG. 4(d).

In any case, if the super-resolved face region is deemed to be compatible with the defective face region, information from the super-resolved face region can be pasted onto the main image by any suitable technique to correct the face region of the main image, step 320. The corrected image can be viewed and depending on the nature of the mapping, it can be adjusted by the user, before being finally accepted or rejected, step 330. So for example, where dithering around the periphery of the corrected face region is used as part of the correction process, step 320, the degree of dithering can be adjusted. Similarly, luminance levels or texture parameters in the corrected regions can be manually adjusted by the user, or indeed any parameter of the corrected region and the mapping process can be manually adjusted prior to final approval or rejection by the user.
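
By way of illustration only, a paste with an adjustable transition at the periphery might be sketched as follows. The sketch substitutes a linear feathered blend for the dithering mentioned above, uses a rectangular patch, and introduces a hypothetical `feather` parameter standing in for the user-adjustable degree of transition.

```python
def feather_paste(main, patch, top, left, feather=2):
    """Paste `patch` into a copy of `main`, fading linearly near the patch edge."""
    out = [row[:] for row in main]
    height, width = len(patch), len(patch[0])
    for y in range(height):
        for x in range(width):
            # Distance, in pixels, from this patch pixel to the nearest patch edge.
            edge_distance = min(x, y, width - 1 - x, height - 1 - y)
            alpha = min(1.0, (edge_distance + 1) / (feather + 1))
            old = main[top + y][left + x]
            out[top + y][left + x] = alpha * patch[y][x] + (1.0 - alpha) * old
    return out


main_image = [[0.0] * 8 for _ in range(8)]
corrected_face = [[90.0] * 6 for _ in range(6)]
result = feather_paste(main_image, corrected_face, top=1, left=1, feather=2)
print([round(value) for value in result[4]])
```

Setting `feather` to zero gives a hard-edged paste, while larger values widen the transition band between the corrected region and the remainder of the main image.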

While AAM provides one approach to determine the outside boundary of a facial region, other well-known image processing techniques such as edge detection, region growing and skin color analysis may be used in addition or as alternatives to AAM. However, these may not have the advantage of also being useful in analysing a face region for defects and/or for pose information. Other techniques which can prove useful include applying foreground/background separation to either the low-resolution images or the main image prior to running face detection to reduce overall processing time by only analysing foreground regions and particularly foreground skin segments. Local colour segmentation applied across the boundary of a foreground/background contour can assist in further refining the boundary of a facial region.

Once the user is satisfied with the placement of the reconstructed face region they may choose to merge it with the main image; alternatively, if they are not happy they can cancel the reconstruction process. These actions are typically selected through buttons on the camera user interface where the correction module is implemented on the acquisition device 20.

As a practical example, consider the system used to correct an eye defect, where one eye is shut in the main image frame due to the subject “blinking” during the acquisition. Immediately after the main image acquisition the user is prompted to determine if they wish to correct this defect. If they confirm this, then the camera begins by analyzing a set of face regions stored from preview images acquired immediately prior to the main image acquisition. It is assumed that a set of, say, 20 images was saved from the one second period immediately prior to image acquisition. As the defect was a blinking eye, the initial testing determines that the last, say, 10 of these preview images are not useful. However the previous 10 images are determined to be suitable. Additional testing of these images might include the determination of facial pose, eliminating images where the facial pose varies by more than 5% from the averaged pose across all previews; and a determination of the size of the facial region, eliminating images where the size varies by more than 25% from the averaged size across all images. The reason the threshold is higher for the latter test is that it is easier to rescale face regions than to correct for pose variations.
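
By way of illustration only, the pose and size tests of this example might be sketched as follows, using the 5% and 25% thresholds given above. Representing pose as a single scalar value, the dictionary layout and the function names are assumptions made for this sketch.

```python
def within_fraction_of_mean(values, max_fraction):
    """Flag each value that lies within max_fraction of the mean of all values."""
    mean = sum(values) / len(values)
    return [abs(value - mean) <= abs(mean) * max_fraction for value in values]


def select_consistent(faces, pose_tolerance=0.05, size_tolerance=0.25):
    """Keep faces whose pose and size are both close to the set averages."""
    pose_ok = within_fraction_of_mean([f["pose"] for f in faces], pose_tolerance)
    size_ok = within_fraction_of_mean([f["size"] for f in faces], size_tolerance)
    return [f for f, p, s in zip(faces, pose_ok, size_ok) if p and s]


# Hypothetical measurements for five buffered preview face regions.
faces = [
    {"id": 0, "pose": 100.0, "size": 64.0},
    {"id": 1, "pose": 101.0, "size": 66.0},
    {"id": 2, "pose": 99.0, "size": 60.0},
    {"id": 3, "pose": 120.0, "size": 62.0},   # pose differs too much
    {"id": 4, "pose": 100.0, "size": 110.0},  # size differs too much
]
print([f["id"] for f in select_consistent(faces)])
```

Both averages are taken over the full set before any region is removed, matching the wording "across all previews" and "across all images".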

In variations of the above described embodiment, the regions that are combined may include portions of the background region surrounding the main face region. This is particularly important where the defect to be corrected in the main acquired image is due to face motion during image exposure. This will lead to a face region with a poorly defined outer boundary in the main image and the super-resolution image which is superimposed upon it typically incorporates portions of the background for properly correcting this face motion defect. A determination of whether to include background regions for face reconstruction can be made by the user, or may be determined automatically after a defect analysis is performed on the main acquired image. In the latter case, where the defect comprises blurring due to face motion, then background regions will normally be included in the super-resolution reconstruction process. In an alternative embodiment, a reconstructed background can be created using either (i) region infilling techniques for a background region of relatively homogeneous colour and texture characteristics, or (ii) directly from the preview image stream using image alignment and super-resolution techniques. In the latter case the reconstructed background is merged into a gap in the main image background created by the separation of foreground from background; the reconstructed face region is next merged into the separated foreground region, specifically into the facial region of the foreground and finally the foreground is re-integrated with the enhanced background region.

After applying super-resolution methods to create a higher resolution face region from multiple low-resolution preview images, some additional scaling and alignment operations are normally involved. Furthermore, some blending, infilling and morphological operations may be used in order to ensure a smooth transition between the newly constructed super-resolution face region and the background of the main acquired image. This is particularly the case where the defect to be corrected is motion of the face during image exposure. In the case of motion defects it may also be desirable to reconstruct portions of the image background prior to integration of the reconstructed face region into the main image.

It is also desirable to match the overall luminance levels of the new face region with that of the old face region, and this is best achieved through a matching of the skin colour between the old region and the newly constructed one. Preview images are acquired under fixed camera settings and can be over/under exposed. This may not be fully compensated for during the super-resolution process and may involve additional image processing operations.
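
By way of illustration only, a very simple luminance match might be sketched as follows. The sketch scales the new region by a single gain so that its mean luminance equals that of the old region; it works on the whole region rather than on skin pixels only, and that simplification and the function names are assumptions made for this sketch.

```python
def mean_luminance(region):
    """Average value of a 2-D grid of luminance samples."""
    return sum(sum(row) for row in region) / (len(region) * len(region[0]))


def match_mean_luminance(new_region, old_region, max_value=255.0):
    """Scale new_region by one gain so its mean matches old_region (before clipping)."""
    new_mean = mean_luminance(new_region)
    gain = mean_luminance(old_region) / new_mean if new_mean else 1.0
    return [[min(max_value, value * gain) for value in row] for row in new_region]


# An under-exposed reconstructed region and the brighter region it replaces.
reconstructed = [[50.0, 100.0], [150.0, 100.0]]
original = [[120.0, 120.0], [120.0, 120.0]]
matched = match_mean_luminance(reconstructed, original)
print(mean_luminance(matched))
```

A single gain preserves the relative contrast of the reconstructed region; a fuller treatment could match histograms or restrict the statistics to skin-coloured pixels.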

While the above described embodiments have been directed to replacing face regions within an image, it will be seen that AAM can be used to model any type of feature of an image. So in certain embodiments, the patches to be used for super-resolution reconstruction may be sub-regions within a face region. For example, it may be desired to reconstruct only a segment of the face regions, such as an eye or mouth region, rather than the entire face region. In such cases, a determination of the precise boundary of the sub-region is of less importance as the sub-region will be merged into a surrounding region of substantially similar colour and texture (i.e. skin colour and texture). Thus, it is sufficient to center the eye regions to be combined or to align the corners of the mouth regions and to rely on blending the surrounding skin coloured areas into the main image.

In one or more of the above embodiments, separate face regions may be individually tracked (see also U.S. application Ser. No. 11/464,083, which is hereby incorporated by reference). Regions may be tracked from frame-to-frame. Preview or post-view face regions can be extracted, analyzed and aligned with each other and with the face region in the main or final acquired image. In addition, in techniques according to certain embodiments, faces may be tracked between frames in order to find and associate smaller details between previews or post-views on the face. For example, a left eye from Joe's face in preview N may be associated with a left eye from Joe's face in preview N+1. These may be used together to form one or more enhanced quality images of Joe's eye. This is advantageous because small features (an eye, a mouth, a nose, an eye component such as an eye lid or eye brow, or a pupil or iris, or an ear, chin, beard, mustache, forehead, hairstyle, etc.) are not as easily traceable between frames as larger features and their absolute or relative positional shifts between frames tend to be more substantial relative to their size.
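
By way of illustration only, associating small features between consecutive previews might be sketched as follows. Greedy nearest-centre matching, the distance limit, the feature labels and the coordinates are assumptions made for this sketch; the described embodiments do not prescribe a particular association rule.

```python
import math


def associate(previous, current, max_distance=20.0):
    """Greedily pair each previous feature with the nearest unclaimed current one.

    `previous` and `current` map a feature label to an (x, y) centre.
    Features farther apart than max_distance are left unpaired.
    """
    pairs = {}
    unclaimed = dict(current)
    for prev_id, prev_pos in previous.items():
        if not unclaimed:
            break
        nearest = min(unclaimed, key=lambda k: math.dist(prev_pos, unclaimed[k]))
        if math.dist(prev_pos, unclaimed[nearest]) <= max_distance:
            pairs[prev_id] = nearest
            del unclaimed[nearest]
    return pairs


# Hypothetical eye centres for one face in preview N and preview N+1.
preview_n = {"joe_left_eye": (40.0, 50.0), "joe_right_eye": (70.0, 50.0)}
preview_n_plus_1 = {"feature_a": (72.0, 52.0), "feature_b": (41.0, 51.0)}
print(associate(preview_n, preview_n_plus_1))
```

Restricting the search to features inside an already tracked face region is what makes such a simple rule workable for small features whose shifts are large relative to their size.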

The present invention is not limited to the embodiments described above herein, which may be amended or modified without departing from the scope of the present invention as set forth in the appended claims, and structural and functional equivalents thereof.

In methods that may be performed according to preferred embodiments herein and that may have been described above and/or claimed below, the operations have been described in selected typographical sequences. However, the sequences have been selected and so ordered for typographical convenience and are not intended to imply any particular order for performing the operations.

In addition, all references cited above herein, in addition to the background and summary of the invention sections themselves, are hereby incorporated by reference into the detailed description of the preferred embodiments as disclosing alternative embodiments and components. The following are also incorporated by reference for this purpose: U.S. patent applications Nos. 60/829,127, 60/804,546, 60/821,165 Ser. Nos. 11/1,554,539, 11/464,083, 11/027,001, 10/842,244, 11/024,046, 11/233,513, 11/460,218, 11/573,713, 11/319,766, 11/464,083, 10/744,020 and 11/460,218, and U.S. published application no. 2006/0285754.

What is claimed is:
 1. A computerized method comprising: receiving a plurality of images of approximately a same scene; detecting at least a first object and a second object within a first image from among the plurality of images; detecting at least a third object and a fourth object within a second image from among the plurality of images; tracking the first object in the first image as corresponding to the third object in the second image; tracking the second object in the first image as corresponding to the fourth object in the second image; wherein each of the first object, the second object, the third object, and the fourth object is distinguishable from each other.
 2. The method of claim 1, further comprising: detecting at least a fifth object within a third image from among the plurality of images based on the tracking of the first object and the tracking of the second object; identifying the fifth object as corresponding to either the first object and the third object or the second object and the fourth object.
 3. The method of claim 2, further comprising: when the fifth object corresponds to the first object and the third object, correcting the third image by replacing at least the fifth object within the third image with the first object and the third object; when the fifth object corresponds to the second object and the fourth object, correcting the third image by replacing at least the fifth object within the third image with the second object and the fourth object.
 4. The method of claim 2, wherein the first image, the second image, or the third image has at least one of a different focus setting, exposure level, or resolution from each other.
 5. The method of claim 1, further comprising: extracting the first object and the second object from the first image; extracting the third object and the fourth object from the second image; storing the first object and the third object in a first data store; storing the second object and the fourth object in a second data store.
 6. The method of claim 1, wherein the tracking of the first object comprises identifying a first location of the first object within the first image and a third location of the third object within the second image.
 7. The method of claim 1, wherein the first object comprises a face, a portion of a face, a facial feature, an eye, a mouth, a nose, a part of an eye, an eye lid, an eye brow, a pupil, an iris, an ear, a chin, a beard, a mustache, a forehead, or a hairstyle.
 8. An apparatus comprising: a memory including a plurality of images of approximately a same scene; a processor in communication with the memory, the processor configured to: detect at least a first object and a second object within a first image from among the plurality of images; detect at least a third object and a fourth object within a second image from among the plurality of images; track the first object in the first image as corresponding to the third object in the second image; track the second object in the first image as corresponding to the fourth object in the second image; wherein each of the first object, the second object, the third object, and the fourth object is distinguishable from each other.
 9. The apparatus of claim 8, wherein the processor is further configured to: detect at least a fifth object within a third image from among the plurality of images based on the tracking of the first object and the tracking of the second object; identify the fifth object as corresponding to either the first object and the third object or the second object and the fourth object.
 10. The apparatus of claim 9, wherein the processor is further configured to: when the fifth object corresponds to the first object and the third object, correct the third image by replacing at least the fifth object within the third image with the first object and the third object; when the fifth object corresponds to the second object and the fourth object, correct the third image by replacing at least the fifth object within the third image with the second object and the fourth object.
 11. The apparatus of claim 9, wherein the first image, the second image, or the third image has at least one of a different focus setting, exposure level, or resolution from each other.
 12. The apparatus of claim 8, wherein the processor is further configured to: extract the first object and the second object from the first image; extract the third object and the fourth object from the second image; store the first object and the third object in a first data store included in the memory; store the second object and the fourth object in a second data store included in the memory.
 13. The apparatus of claim 8, wherein the processor tracks the first object by identifying a first location of the first object within the first image and a third location of the third object within the second image.
 14. The apparatus of claim 8, wherein the first object comprises a face, a portion of a face, a facial feature, an eye, a mouth, a nose, a part of an eye, an eye lid, an eye brow, a pupil, an iris, an ear, a chin, a beard, a mustache, a forehead, or a hairstyle.
 15. The apparatus of claim 8, further comprising a lens and an image sensor to acquire the plurality of images.
 16. A computerized method comprising: receiving a plurality of images of approximately a same scene; detecting at least a first object within a first image from among the plurality of images; tracking the first object within the first image to be a second object within a second image from among the plurality of images; tracking the second object within the second image to be a third object within a third image from among the plurality of images.
 17. The method of claim 16, wherein the tracking of the second object comprises detecting the third object within the third image based on the first object or the second object.
 18. The method of claim 16, further comprising replacing the third object within the third image with the first object and the second object.
 19. The method of claim 16, wherein the tracking of the first object comprises identifying a first location of the first object within the first image and a second location of the second object within the second image.
 20. The method of claim 16, wherein the first image, the second image, or the third image has at least one of a different focus setting, exposure level, or resolution from each other.
 21. The method of claim 16, wherein the first object, the second object, and the third object comprise a same object.
 22. An apparatus comprising: a memory including a plurality of images of approximately a same scene; a processor in communication with the memory, the processor configured to: detect at least a first object within a first image from among the plurality of images; track the first object within the first image to be a second object within a second image from among the plurality of images; track the second object within the second image to be a third object within a third image from among the plurality of images.
 23. The apparatus of claim 22, wherein the processor tracks the second object to detect the third object within the third image based on the first object or the second object.
 24. The apparatus of claim 22, wherein the processor replaces the third object within the third image with the first object and the second object.
 25. The apparatus of claim 22, wherein the third object comprises a face, a portion of a face, a facial feature, an eye, a mouth, a nose, a part of an eye, an eye lid, an eye brow, a pupil, an iris, an ear, a chin, a beard, a mustache, a forehead, or a hairstyle.